Verbal Valency in the MT Between Related Languages

Authors

  • Natalia Klyueva
  • Vladislav Kuboň
Abstract

The paper analyzes the differences in verbal valency frames between two related Slavic languages, Czech and Russian, with regard to their role in a machine translation system. The valency differences are a frequent source of translation errors. The results presented in the paper show that the number of substantially different valency frames is relatively low and that a bilingual valency dictionary containing only the differing valency frames can be used in an MT system in order to achieve a high precision of the translation of verbal valency.
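The idea of a dictionary that stores only the differing frames can be illustrated with a minimal sketch. It is not the authors' implementation: the names `DIFFERING_FRAMES` and `target_case`, the case labels, and the two sample entries are assumptions chosen for illustration and are not taken from the paper. The sketch assumes that a verb argument keeps its source case unless the dictionary lists an exception.

```python
from typing import Dict, Tuple

# Hypothetical exception dictionary, keyed by (Czech verb lemma, Czech case).
# The value is the case the Russian equivalent requires for that argument.
# Both entries are illustrative examples, not data from the paper.
DIFFERING_FRAMES: Dict[Tuple[str, str], str] = {
    ("děkovat", "dat"): "acc",  # děkovat komu -> благодарить кого
    ("rozumět", "dat"): "acc",  # rozumět komu -> понимать кого
}


def target_case(czech_verb: str, source_case: str) -> str:
    """Return the case to generate on the Russian side.

    Verbs absent from the dictionary are assumed to share the same
    frame in both languages, so the source case is passed through.
    """
    return DIFFERING_FRAMES.get((czech_verb, source_case), source_case)


if __name__ == "__main__":
    print(target_case("děkovat", "dat"))  # listed exception
    print(target_case("psát", "acc"))     # not listed, case is kept
```

Keeping only the exceptions keeps the resource small, which matches the abstract's observation that the number of substantially different frames is relatively low.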

Similar resources

Incorporation of a Valency Lexicon into a TectoMT Pipeline

In this paper, we focus on the incorporation of a valency lexicon into the TectoMT system for the Czech-Russian language pair. We demonstrate valency errors in MT output and describe how the introduction of a lexicon influenced the translation results. Though there was no impact on the BLEU score, manual inspection of concrete cases showed some improvement.

Verb Argument Pairing in Czech-English Parallel Treebank

We describe CzEngVallex, a bilingual Czech-English valency lexicon which aligns verbal valency frames and their arguments. It is based on a parallel Czech-English corpus, the Prague Czech-English Dependency Treebank, where for each occurrence of a verb a reference to the underlying Czech and English valency lexicons is explicitly recorded. CzEngVallex lexicon pairs the entries (verb senses) of ...

CzEngVallex: Mapping Valency between Languages

This report presents a guideline for building a resource connected with the project of interlinking Czech and English verbal translational equivalents, based on a parallel, richly annotated dependency treebank containing also valency and semantic roles, namely the parallel Prague Czech-English Dependency Treebank. One of the main aims of this project is to create a high-quality and relatively la...

CzEngVallex: a Bilingual Czech-English Valency Lexicon

This paper introduces a new bilingual Czech-English verbal valency lexicon (called CzEngVallex) representing a relatively large empirical database. It includes 20,835 aligned valency frame pairs (i.e., verb senses which are translations of each other) and their aligned arguments. This new lexicon uses data from the Prague Czech-English Dependency Treebank and also takes advantage of the existin...

The need for MT-oriented versions of Case and Valency in MT

This paper looks at the use in machine translation systems of the linguistic models of Case and Valency. It is argued that neither of these models was originally developed with this use in mind, and both must be adapted somewhat to meet this purpose. In particular, the traditional Valency distinction of complements and adjuncts leads to conflicts when valency frames in different languages are c...



Publication date: 2010